What is happiness

Happiness is an emotional state characterized by feelings of joy, satisfaction, contentment, and fulfillment. While happiness has many different definitions, it is often described as involving positive emotions and life satisfaction.

When most people talk about happiness, they might be talking about how they feel in the present moment, or they might be referring to a more general sense of how they feel about life overall.
Because happiness tends to be such a broadly defined term, psychologists and other social scientists typically use the term ‘subjective well-being’ when they talk about this emotional state. Just as it sounds, subjective well-being tends to focus on an individual’s overall personal feelings about their life in the present.

The aim of this project

The aim of this project is to examine several years of World Happiness Report data and to perform exploratory data analysis, looking for differences in happiness levels between countries and formulating some hypotheses about them.

About Dataset




Context

The World Happiness Report is a landmark survey of the state of global happiness. The first report was published in 2012, the second in 2013, the third in 2015, and the fourth in the 2016 Update. The World Happiness 2017, which ranks 155 countries by their happiness levels, was released at the United Nations at an event celebrating International Day of Happiness on March 20th. The report continues to gain global recognition as governments, organizations and civil society increasingly use happiness indicators to inform their policy-making decisions. Leading experts across fields – economics, psychology, survey analysis, national statistics, health, public policy and more – describe how measurements of well-being can be used effectively to assess the progress of nations. The reports review the state of happiness in the world today and show how the new science of happiness explains personal and national variations in happiness.

Content

The happiness scores and rankings use data from the Gallup World Poll. The scores are based on answers to the main life evaluation question asked in the poll. This question, known as the Cantril ladder, asks respondents to think of a ladder with the best possible life for them being a 10 and the worst possible life being a 0 and to rate their own current lives on that scale. The scores are from nationally representative samples for the years 2013-2016 and use the Gallup weights to make the estimates representative. The columns following the happiness score estimate the extent to which each of six factors – economic production, social support, life expectancy, freedom, absence of corruption, and generosity – contribute to making life evaluations higher in each country than they are in Dystopia, a hypothetical country that has values equal to the world’s lowest national averages for each of the six factors. They have no impact on the total score reported for each country, but they do explain why some countries rank higher than others.

Inspiration

What countries or regions rank the highest in overall happiness and each of the six factors contributing to happiness? How did country ranks or scores change between the 2015 and 2016 as well as the 2016 and 2017 reports? Did any country experience a significant increase or decrease in happiness?

What is Dystopia?

Dystopia is an imaginary country that has the world’s least-happy people. The purpose in establishing Dystopia is to have a benchmark against which all countries can be favorably compared (no country performs more poorly than Dystopia) in terms of each of the six key variables, thus allowing each sub-bar to be of positive width. The lowest scores observed for the six key variables, therefore, characterize Dystopia. Since life would be very unpleasant in a country with the world’s lowest incomes, lowest life expectancy, lowest generosity, most corruption, least freedom and least social support, it is referred to as “Dystopia,” in contrast to Utopia.

What are the residuals?

The residuals, or unexplained components, differ for each country, reflecting the extent to which the six variables either over- or under-explain average 2014-2016 life evaluations. These residuals have an average value of approximately zero over the whole set of countries. Figure 2.2 shows the average residual for each country when the equation in Table 2.1 is applied to average 2014-2016 data for the six variables in that country. We combine these residuals with the estimate for life evaluations in Dystopia so that the combined bar will always have positive values. As can be seen in Figure 2.2, although some life evaluation residuals are quite large, occasionally exceeding one point on the scale from 0 to 10, they are always much smaller than the calculated value in Dystopia, where the average life is rated at 1.85 on the 0 to 10 scale.

What do the columns succeeding the Happiness Score (like Family, Generosity, etc.) describe?

The following columns: GDP per Capita, Family, Life Expectancy, Freedom, Generosity and Trust (Government Corruption) describe the extent to which each factor contributes to the happiness evaluation in each country. The Dystopia Residual metric is the Dystopia happiness score (1.85) plus the residual, or unexplained value, for each country, as stated in the previous answer.

If you add all these factors up, you get the happiness score, so it might be unreliable to model them to predict happiness scores.
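The additive claim can be checked directly. The sketch below is only an illustration: it uses three rows copied from the `head(d2015)` output further down rather than the full file, and the helper columns `Component.Sum` and `Difference` are names introduced here, not part of the dataset.

```r
# Three rows copied from the head(d2015) output (toy subset, not the full file).
toy <- data.frame(
  Country = c("Switzerland", "Iceland", "Denmark"),
  Happiness.Score = c(7.587, 7.561, 7.527),
  Economy..GDP.per.Capita. = c(1.39651, 1.30232, 1.32548),
  Family = c(1.34951, 1.40223, 1.36058),
  Health..Life.Expectancy. = c(0.94143, 0.94784, 0.87464),
  Freedom = c(0.66557, 0.62877, 0.64938),
  Trust..Government.Corruption. = c(0.41978, 0.14145, 0.48357),
  Generosity = c(0.29678, 0.43630, 0.34139),
  Dystopia.Residual = c(2.51738, 2.70201, 2.49204)
)

# Every column except the country name and the score is a component.
factor_cols <- setdiff(names(toy), c("Country", "Happiness.Score"))

# Sum the six factors plus the Dystopia residual and compare with the score.
toy$Component.Sum <- rowSums(toy[, factor_cols])
toy$Difference <- toy$Happiness.Score - toy$Component.Sum

print(toy[, c("Country", "Happiness.Score", "Component.Sum", "Difference")])
```

For these rows the differences are at the level of rounding error, which is why a regression of the score on all of these columns would be close to a tautology.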

d2015 <- read.csv("datasets/2015.csv")
d2016 <- read.csv("datasets/2016.csv")
d2017 <- read.csv("datasets/2017.csv")
d2018 <- read.csv("datasets/2018.csv")
d2019 <- read.csv("datasets/2019.csv")

Displaying the data

head(d2015)
| Country | Region | Happiness.Rank | Happiness.Score | Standard.Error | Economy..GDP.per.Capita. | Family | Health..Life.Expectancy. | Freedom | Trust..Government.Corruption. | Generosity | Dystopia.Residual |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Switzerland | Western Europe | 1 | 7.587 | 0.03411 | 1.39651 | 1.34951 | 0.94143 | 0.66557 | 0.41978 | 0.29678 | 2.51738 |
| Iceland | Western Europe | 2 | 7.561 | 0.04884 | 1.30232 | 1.40223 | 0.94784 | 0.62877 | 0.14145 | 0.43630 | 2.70201 |
| Denmark | Western Europe | 3 | 7.527 | 0.03328 | 1.32548 | 1.36058 | 0.87464 | 0.64938 | 0.48357 | 0.34139 | 2.49204 |
| Norway | Western Europe | 4 | 7.522 | 0.03880 | 1.45900 | 1.33095 | 0.88521 | 0.66973 | 0.36503 | 0.34699 | 2.46531 |
| Canada | North America | 5 | 7.427 | 0.03553 | 1.32629 | 1.32261 | 0.90563 | 0.63297 | 0.32957 | 0.45811 | 2.45176 |
| Finland | Western Europe | 6 | 7.406 | 0.03140 | 1.29025 | 1.31826 | 0.88911 | 0.64169 | 0.41372 | 0.23351 | 2.61955 |
head(d2016)
| Country | Region | Happiness.Rank | Happiness.Score | Lower.Confidence.Interval | Upper.Confidence.Interval | Economy..GDP.per.Capita. | Family | Health..Life.Expectancy. | Freedom | Trust..Government.Corruption. | Generosity | Dystopia.Residual |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Denmark | Western Europe | 1 | 7.526 | 7.460 | 7.592 | 1.44178 | 1.16374 | 0.79504 | 0.57941 | 0.44453 | 0.36171 | 2.73939 |
| Switzerland | Western Europe | 2 | 7.509 | 7.428 | 7.590 | 1.52733 | 1.14524 | 0.86303 | 0.58557 | 0.41203 | 0.28083 | 2.69463 |
| Iceland | Western Europe | 3 | 7.501 | 7.333 | 7.669 | 1.42666 | 1.18326 | 0.86733 | 0.56624 | 0.14975 | 0.47678 | 2.83137 |
| Norway | Western Europe | 4 | 7.498 | 7.421 | 7.575 | 1.57744 | 1.12690 | 0.79579 | 0.59609 | 0.35776 | 0.37895 | 2.66465 |
| Finland | Western Europe | 5 | 7.413 | 7.351 | 7.475 | 1.40598 | 1.13464 | 0.81091 | 0.57104 | 0.41004 | 0.25492 | 2.82596 |
| Canada | North America | 6 | 7.404 | 7.335 | 7.473 | 1.44015 | 1.09610 | 0.82760 | 0.57370 | 0.31329 | 0.44834 | 2.70485 |
head(d2017)
| Country | Happiness.Rank | Happiness.Score | Whisker.high | Whisker.low | Economy..GDP.per.Capita. | Family | Health..Life.Expectancy. | Freedom | Generosity | Trust..Government.Corruption. | Dystopia.Residual |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Norway | 1 | 7.537 | 7.594445 | 7.479556 | 1.616463 | 1.533524 | 0.7966665 | 0.6354226 | 0.3620122 | 0.3159638 | 2.277027 |
| Denmark | 2 | 7.522 | 7.581728 | 7.462272 | 1.482383 | 1.551122 | 0.7925655 | 0.6260067 | 0.3552805 | 0.4007701 | 2.313707 |
| Iceland | 3 | 7.504 | 7.622031 | 7.385970 | 1.480633 | 1.610574 | 0.8335521 | 0.6271626 | 0.4755402 | 0.1535266 | 2.322715 |
| Switzerland | 4 | 7.494 | 7.561772 | 7.426228 | 1.564980 | 1.516912 | 0.8581313 | 0.6200706 | 0.2905493 | 0.3670073 | 2.276716 |
| Finland | 5 | 7.469 | 7.527542 | 7.410458 | 1.443572 | 1.540247 | 0.8091577 | 0.6179509 | 0.2454828 | 0.3826115 | 2.430182 |
| Netherlands | 6 | 7.377 | 7.427426 | 7.326574 | 1.503945 | 1.428939 | 0.8106961 | 0.5853845 | 0.4704898 | 0.2826618 | 2.294804 |
head(d2018)
| Overall.rank | Country.or.region | Score | GDP.per.capita | Social.support | Healthy.life.expectancy | Freedom.to.make.life.choices | Generosity | Perceptions.of.corruption |
|---|---|---|---|---|---|---|---|---|
| 1 | Finland | 7.632 | 1.305 | 1.592 | 0.874 | 0.681 | 0.202 | 0.393 |
| 2 | Norway | 7.594 | 1.456 | 1.582 | 0.861 | 0.686 | 0.286 | 0.340 |
| 3 | Denmark | 7.555 | 1.351 | 1.590 | 0.868 | 0.683 | 0.284 | 0.408 |
| 4 | Iceland | 7.495 | 1.343 | 1.644 | 0.914 | 0.677 | 0.353 | 0.138 |
| 5 | Switzerland | 7.487 | 1.420 | 1.549 | 0.927 | 0.660 | 0.256 | 0.357 |
| 6 | Netherlands | 7.441 | 1.361 | 1.488 | 0.878 | 0.638 | 0.333 | 0.295 |
head(d2019)
| Overall.rank | Country.or.region | Score | GDP.per.capita | Social.support | Healthy.life.expectancy | Freedom.to.make.life.choices | Generosity | Perceptions.of.corruption |
|---|---|---|---|---|---|---|---|---|
| 1 | Finland | 7.769 | 1.340 | 1.587 | 0.986 | 0.596 | 0.153 | 0.393 |
| 2 | Denmark | 7.600 | 1.383 | 1.573 | 0.996 | 0.592 | 0.252 | 0.410 |
| 3 | Norway | 7.554 | 1.488 | 1.582 | 1.028 | 0.603 | 0.271 | 0.341 |
| 4 | Iceland | 7.494 | 1.380 | 1.624 | 1.026 | 0.591 | 0.354 | 0.118 |
| 5 | Netherlands | 7.488 | 1.396 | 1.522 | 0.999 | 0.557 | 0.322 | 0.298 |
| 6 | Switzerland | 7.480 | 1.452 | 1.526 | 1.052 | 0.572 | 0.263 | 0.343 |



First 5 countries

First, let’s see the top 5 for the years 2015, 2016, 2017, 2018 and 2019.

d2015[,c("Country", "Happiness.Rank", "Happiness.Score")] |> head(5)
##       Country Happiness.Rank Happiness.Score
## 1 Switzerland              1           7.587
## 2     Iceland              2           7.561
## 3     Denmark              3           7.527
## 4      Norway              4           7.522
## 5      Canada              5           7.427
d2016[,c("Country", "Happiness.Rank", "Happiness.Score")] |> head(5)
##       Country Happiness.Rank Happiness.Score
## 1     Denmark              1           7.526
## 2 Switzerland              2           7.509
## 3     Iceland              3           7.501
## 4      Norway              4           7.498
## 5     Finland              5           7.413
d2017[,c("Country", "Happiness.Rank", "Happiness.Score")] |> head(5)
##       Country Happiness.Rank Happiness.Score
## 1      Norway              1           7.537
## 2     Denmark              2           7.522
## 3     Iceland              3           7.504
## 4 Switzerland              4           7.494
## 5     Finland              5           7.469
d2018[,c("Country.or.region", "Overall.rank", "Score")] |> head(5)
##   Country.or.region Overall.rank Score
## 1           Finland            1 7.632
## 2            Norway            2 7.594
## 3           Denmark            3 7.555
## 4           Iceland            4 7.495
## 5       Switzerland            5 7.487
d2019[,c("Country.or.region", "Overall.rank", "Score")] |> head(5)
##   Country.or.region Overall.rank Score
## 1           Finland            1 7.769
## 2           Denmark            2 7.600
## 3            Norway            3 7.554
## 4           Iceland            4 7.494
## 5       Netherlands            5 7.488
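The column names change between the 2015-2017 files and the 2018-2019 files, which is why the calls above differ. One possible way to avoid repeating the column lists is a small helper that maps either naming scheme onto common names. This is only a sketch: `standardise_scores()` and the toy data frames are hypothetical and use a couple of rows copied from the output above.

```r
# Hypothetical helper: return country, rank and score under common names,
# whichever naming scheme (2015-2017 or 2018-2019) the data frame uses.
standardise_scores <- function(df) {
  # Return the first candidate column name that is present in df.
  pick <- function(candidates) {
    found <- intersect(candidates, names(df))
    if (length(found) == 0) {
      stop("none of the expected columns found: ",
           paste(candidates, collapse = ", "))
    }
    found[1]
  }
  data.frame(
    country = df[[pick(c("Country", "Country.or.region"))]],
    rank    = df[[pick(c("Happiness.Rank", "Overall.rank"))]],
    score   = df[[pick(c("Happiness.Score", "Score"))]],
    stringsAsFactors = FALSE
  )
}

# Toy rows copied from the printed output, not the full files.
toy2015 <- data.frame(Country = c("Switzerland", "Iceland"),
                      Happiness.Rank = 1:2,
                      Happiness.Score = c(7.587, 7.561))
toy2019 <- data.frame(Country.or.region = c("Finland", "Denmark"),
                      Overall.rank = 1:2,
                      Score = c(7.769, 7.600))

top2015 <- standardise_scores(toy2015)
top2019 <- standardise_scores(toy2019)
print(top2015)
print(top2019)
```

With the real data, `head(standardise_scores(d2015), 5)` and `head(standardise_scores(d2019), 5)` would then share the same three column names.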

We can clearly see that Nordic countries such as Finland, Norway and Denmark are consistently at or near the top.

Last 5 countries

Now let’s look at the 5 countries with the lowest scores.

d2015[,c("Country", "Happiness.Rank", "Happiness.Score")] |> tail(5)
##     Country Happiness.Rank Happiness.Score
## 154  Rwanda            154           3.465
## 155   Benin            155           3.340
## 156   Syria            156           3.006
## 157 Burundi            157           2.905
## 158    Togo            158           2.839
d2016[,c("Country", "Happiness.Rank", "Happiness.Score")] |> tail(5)
##         Country Happiness.Rank Happiness.Score
## 153       Benin            153           3.484
## 154 Afghanistan            154           3.360
## 155        Togo            155           3.303
## 156       Syria            156           3.069
## 157     Burundi            157           2.905
d2017[,c("Country", "Happiness.Rank", "Happiness.Score")] |> tail(5)
##                      Country Happiness.Rank Happiness.Score
## 151                   Rwanda            151           3.471
## 152                    Syria            152           3.462
## 153                 Tanzania            153           3.349
## 154                  Burundi            154           2.905
## 155 Central African Republic            155           2.693
d2018[,c("Country.or.region", "Overall.rank", "Score")] |> tail(5)
##            Country.or.region Overall.rank Score
## 152                    Yemen          152 3.355
## 153                 Tanzania          153 3.303
## 154              South Sudan          154 3.254
## 155 Central African Republic          155 3.083
## 156                  Burundi          156 2.905
d2019[,c("Country.or.region", "Overall.rank", "Score")] |> tail(5)
##            Country.or.region Overall.rank Score
## 152                   Rwanda          152 3.334
## 153                 Tanzania          153 3.231
## 154              Afghanistan          154 3.203
## 155 Central African Republic          155 3.083
## 156              South Sudan          156 2.853
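library(ggplot2)  # plots
library(car)      # avPlots()
library(dplyr)    # %>%, select(), group_by(), summarise()
library(forcats)  # fct_reorder()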



Interesting correlations


Overall

Let’s first look at how the factor columns of the dataset relate to the score overall, using added-variable plots from a linear model.

fit <- lm(Score ~ GDP.per.capita + Social.support + Healthy.life.expectancy + Freedom.to.make.life.choices + Generosity + Perceptions.of.corruption, data=d2019)
avPlots(fit, ask=FALSE)

Now let’s go into the specifics of some interesting correlations.

GDP and happiness score

It’s interesting to look at correlations in the data in order to understand how there can be such large differences between the first and the last five countries. For example, the following plots show the relationship between GDP per capita and the happiness score, with the points coloured by life expectancy. We can see that the relationship is quite stable over time.

ggplot(d2015) + geom_point(aes(x=Happiness.Score, y=Economy..GDP.per.Capita. , col=Health..Life.Expectancy.))

ggplot(d2016) + geom_point(aes(x=Happiness.Score, y=Economy..GDP.per.Capita. , col=Health..Life.Expectancy.))

ggplot(d2017) + geom_point(aes(x=Happiness.Score, y=Economy..GDP.per.Capita. , col=Health..Life.Expectancy.))

ggplot(d2018) + geom_point(aes(x=Score, y=GDP.per.capita , col=Healthy.life.expectancy))

ggplot(d2019) + geom_point(aes(x=Score, y=GDP.per.capita , col=Healthy.life.expectancy))
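To put a number on what the scatter plots show, one option is the Pearson correlation from base R's `cor()`. The sketch below is illustrative only: `score_gdp_cor()` is a hypothetical helper, and the toy data frame holds just the six rows printed by `head(d2019)`, so the value it produces says nothing about the full dataset.

```r
# Hypothetical helper: Pearson correlation between a score column and a GDP
# column, with the column names passed in because they differ across years.
score_gdp_cor <- function(df, score_col, gdp_col) {
  cor(df[[score_col]], df[[gdp_col]], method = "pearson")
}

# Six rows copied from head(d2019); a toy subset, not the full file.
toy2019 <- data.frame(
  Score = c(7.769, 7.600, 7.554, 7.494, 7.488, 7.480),
  GDP.per.capita = c(1.340, 1.383, 1.488, 1.380, 1.396, 1.452)
)

r_toy <- score_gdp_cor(toy2019, "Score", "GDP.per.capita")
print(r_toy)

# On the full data the calls would look like this:
# score_gdp_cor(d2015, "Happiness.Score", "Economy..GDP.per.Capita.")
# score_gdp_cor(d2019, "Score", "GDP.per.capita")
```

Comparing the five yearly values side by side would be a direct way to check the claim that the relationship is stable over time.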


Generosity and GDP

It’s interesting to see whether people care less about the generosity of others when they are richer. To answer this question, it is sufficient to plot Generosity against GDP per capita.

g <- ggplot(d2019, aes(Generosity, GDP.per.capita))
g + geom_point() +
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'

It seems that people are more sensitive to generosity when they have fewer resources, which makes sense.

Perception of corruption and score

Now let’s see how strongly the perception of corruption is associated with the happiness score.

g <- ggplot(d2019, aes(Perceptions.of.corruption, Score))
g + geom_point() +
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'

Freedom to make life choices and score

Now let’s see how strongly the freedom to make life choices is associated with the happiness score.

g <- ggplot(d2019, aes(Freedom.to.make.life.choices, Score))
g + geom_point() +
  geom_smooth(method = "lm", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'

It is not surprising that the more freedom people have to make life choices, the happier they are.

Grouping by Region

Happiness Score by Region

Now let’s see the average happiness score by Region (year 2015)

r <- d2015 %>%
  select(Region, Happiness.Score) %>%
  group_by(Region) %>%
  summarise(n = n(),
            happinessMean = mean(Happiness.Score),
            happinessMedian = median(Happiness.Score),
            happinessStandard = sd(Happiness.Score))

ggplot(r, aes(x = fct_reorder(Region, happinessMedian), y = happinessMean, fill = happinessMean)) +
  geom_col(show.legend = FALSE) +
  scale_fill_gradient(low = "red", high = "green") +
  labs(y="", x="",fill = "Score",
       title = "Average Happiness score grouped by Region") +
  theme_light(base_size = 9) + coord_flip()

The result shows that the region where people are happiest is “Australia and New Zealand”.

GDP by Region

Now let’s see the average GDP per capita score by Region (year 2015)

r <- d2015 %>%
  select(Region, Economy..GDP.per.Capita.) %>%
  group_by(Region) %>%
  summarise(n = n(),
            gdpMean = mean(Economy..GDP.per.Capita.),
            gdpMedian = median(Economy..GDP.per.Capita.),
            gdpStandard = sd(Economy..GDP.per.Capita.))

ggplot(r, aes(x = fct_reorder(Region, gdpMedian), y = gdpMean, fill = gdpMean)) +
  geom_col(show.legend = FALSE) +
  scale_fill_gradient(low = "black", high = "blue") +
  labs(y="", x="",fill = "Economy..GDP.per.Capita.",
       title = "Average GDP score grouped by Region") +
  theme_light(base_size = 9) + coord_flip()

Although Australia and New Zealand is at the top for happiness score, it is not at the top for GDP per capita. As we can see, the first region for GDP is North America, followed by Western Europe.

Density distribution

Let’s have a look at the density distribution of the happiness score.

ggplot(d2019,aes(Score))+
  geom_density()

median(d2019$Score)
## [1] 5.3795
mean(d2019$Score)
## [1] 5.407096
sd(d2019$Score)
## [1] 1.11312

Like many complex phenomena, the distribution looks roughly Gaussian. The unusual thing is that it is quite flat around the mean. Furthermore, we can see that both the mean and the median are above the midpoint of the scale, which is (0+10)/2=5. This is an optimistic result.
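The Gaussian reading of the density plot is a visual impression. One common way to check it more formally is the Shapiro-Wilk test, available in base R as `shapiro.test()`, together with a normal Q-Q plot. The sketch below is a stand-in only: it runs the test on simulated values drawn with the mean and standard deviation printed above, because the real `d2019$Score` vector is not reproduced here, so the result on the real data may well differ.

```r
set.seed(42)

# Stand-in for d2019$Score: 156 simulated values (an assumption, not real data),
# using the mean and standard deviation printed above.
sim_scores <- rnorm(156, mean = 5.407096, sd = 1.11312)

# Shapiro-Wilk test: a small p-value is evidence against normality.
res <- shapiro.test(sim_scores)
print(res)

# On the real data the equivalent calls would be:
# shapiro.test(d2019$Score)
# qqnorm(d2019$Score); qqline(d2019$Score)
```

A flat top like the one visible in the density plot usually shows up in the Q-Q plot as points bending away from the reference line in the tails.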

Now let’s see whether the mean has shifted between 2015 and 2019.

mean(d2015$Happiness.Score)
## [1] 5.375734
mean(d2016$Happiness.Score)
## [1] 5.382185
mean(d2017$Happiness.Score)
## [1] 5.354019
mean(d2018$Score)
## [1] 5.375917
mean(d2019$Score)
## [1] 5.407096
mean(d2019$Score)-mean(d2015$Happiness.Score)
## [1] 0.03136198

Conclusions

When will we all reach 10

While the technical answer is never, we can see that between 2015 and 2019 there has been an overall improvement of 0.03136198 in the mean happiness score. Theoretically, and rather utopically, assuming a constant rate of improvement, the mean would reach 10.0 in (10 - mean(d2019$Score)) / (mean(d2019$Score) - mean(d2015$Happiness.Score)) * (2019 - 2015) ≈ 585.8 years.

How to reach that

We saw the clear correlation between happiness and GDP per capita. One likely way to improve the happiness of most people would be to raise the average level of wealth. Of course, while doing so, there are other things to improve, such as reducing corruption and enhancing freedom.

(10-mean(d2019$Score))/(mean(d2019$Score)-mean(d2015$Happiness.Score))*(2019-2015)
## [1] 585.7926
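The one-line extrapolation above can be unpacked into its steps. The sketch below uses the two means as printed earlier instead of the data frames, so the result can differ from the output above in the last decimals. The variable names are introduced here for illustration, and the constant yearly gain is the same strong assumption the text already makes.

```r
# Means as printed earlier in the document (rounded to six decimals).
mean_2015 <- 5.375734
mean_2019 <- 5.407096

# Average gain per year over the four-year span.
gain_per_year <- (mean_2019 - mean_2015) / (2019 - 2015)

# Remaining distance from the 2019 mean to the top of the 0-10 scale.
gap_to_ten <- 10 - mean_2019

# Years needed if the yearly gain stayed constant.
years_to_ten <- gap_to_ten / gain_per_year

print(gain_per_year)
print(years_to_ten)
```

Written this way, it is easier to see that the estimate is very sensitive to the yearly gain: a tiny change in either mean shifts the result by many years.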
